11 research outputs found
Conditional Hardness of Earth Mover Distance
The Earth Mover Distance (EMD) between two sets of points A, B subseteq R^d with |A| = |B| = n is the minimum total Euclidean distance of any perfect matching between A and B. One of its generalizations is asymmetric EMD, the minimum total Euclidean distance of any matching of size |A| between sets of points A, B subseteq R^d with |A| <= |B|. The problems of computing EMD and asymmetric EMD are well-studied and have many applications in computer science, some of which also ask for the EMD-optimal matching itself. Unfortunately, all known algorithms require at least quadratic time in n to compute EMD exactly. Approximation algorithms that run in nearly linear time in n are known (even for finding approximately optimal matchings), but they suffer from an exponential dependence on the dimension d.
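For concreteness, exact EMD is a min-cost matching problem on the pairwise distance matrix, so it can be solved with the Hungarian algorithm in cubic time, consistent with the quadratic barrier just mentioned. The sketch below is a minimal illustration using SciPy (the helper name emd is ours, not the paper's); it handles the symmetric and asymmetric variants alike.

```python
# Minimal sketch: exact (asymmetric) EMD as a min-cost matching, solved with
# the Hungarian algorithm. Cubic time -- no better than the quadratic barrier
# discussed above, but exact.
import numpy as np
from scipy.optimize import linear_sum_assignment
from scipy.spatial.distance import cdist

def emd(A: np.ndarray, B: np.ndarray) -> float:
    """Exact EMD between point sets A, B in R^d with |A| <= |B|.

    Builds the |A| x |B| Euclidean cost matrix and solves the assignment
    problem; when |A| == |B| this is the EMD-optimal perfect matching.
    """
    cost = cdist(A, B)                        # pairwise Euclidean distances
    rows, cols = linear_sum_assignment(cost)  # min-cost matching of size |A|
    return cost[rows, cols].sum()

# Tiny example in R^2: each point of A matches the point of B directly above.
A = np.array([[0.0, 0.0], [1.0, 0.0]])
B = np.array([[0.0, 1.0], [1.0, 1.0], [5.0, 5.0]])
print(emd(A, B))  # 2.0
```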
In this paper we show that significant improvements in exact and approximate algorithms for EMD would contradict conjectures in fine-grained complexity. In particular, we prove the following results:
- Under the Orthogonal Vectors Conjecture, there is some c>0 such that EMD in Omega(c^{log^* n}) dimensions cannot be computed in truly subquadratic time.
- Under the Hitting Set Conjecture, for every delta>0, no truly subquadratic time algorithm can find a (1 + 1/n^delta)-approximate EMD matching in omega(log n) dimensions.
- Under the Hitting Set Conjecture, for every eta = 1/omega(log n), no truly subquadratic time algorithm can find a (1 + eta)-approximate asymmetric EMD matching in omega(log n) dimensions.
On the Power of Preconditioning in Sparse Linear Regression
Sparse linear regression is a fundamental problem in high-dimensional
statistics, but strikingly little is known about how to efficiently solve it
without restrictive conditions on the design matrix. We consider the
(correlated) random design setting, where the covariates are independently
drawn from a multivariate Gaussian N(0, Sigma) with Sigma in R^{n x n}, and
seek estimators hat{w} minimizing the prediction error
(hat{w} - w^*)^T Sigma (hat{w} - w^*), where w^* is the k-sparse ground
truth. Information theoretically, one can achieve strong error bounds with
O(k log n) samples for arbitrary Sigma and w^*; however, no efficient
algorithms are known to match these guarantees even with o(n) samples,
without further assumptions on Sigma or w^*. As far as hardness,
computational lower bounds are only known with worst-case design matrices.
Random-design instances are known which are hard for the Lasso, but these
instances can generally be solved by the Lasso after a simple
change-of-basis (i.e. preconditioning).
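As a minimal sketch of that pipeline (toy instance and illustrative names, not the constructions from the paper): precondition the design with a matrix Q, fit the Lasso in the new basis, and map the coefficients back. Which Q, if any, makes this succeed is precisely what the results below address.

```python
# Generic preconditioned-Lasso pipeline: change basis by Q, fit, map back.
# Q is a free input here; choosing it well is the subject of the paper.
import numpy as np
from sklearn.linear_model import Lasso

def preconditioned_lasso(X, y, Q, alpha=0.1):
    """Fit y ~ (X Q) v with an l1 penalty, then return hat{w} = Q hat{v}."""
    v_hat = Lasso(alpha=alpha).fit(X @ Q, y).coef_
    return Q @ v_hat

# Toy instance: k-sparse ground truth, correlated Gaussian design,
# m samples in dimension d.
rng = np.random.default_rng(0)
m, d, k = 100, 200, 5
Sigma = 0.9 ** np.abs(np.subtract.outer(np.arange(d), np.arange(d)))  # AR(1)
X = rng.multivariate_normal(np.zeros(d), Sigma, size=m)
w_star = np.zeros(d)
w_star[rng.choice(d, size=k, replace=False)] = 1.0
y = X @ w_star + 0.1 * rng.standard_normal(m)

w_hat = preconditioned_lasso(X, y, np.eye(d))  # Q = I is the plain Lasso
print((w_hat - w_star) @ Sigma @ (w_hat - w_star))  # prediction error
```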
In this work, we give upper and lower bounds clarifying the power of
preconditioning in sparse linear regression. First, we show that the
preconditioned Lasso can solve a large class of sparse linear regression
problems nearly optimally: it succeeds whenever the dependency structure of the
covariates, in the sense of the Markov property, has low treewidth -- even if
Sigma is highly ill-conditioned. Second, we construct (for the first time)
random-design instances which are provably hard for an optimally
preconditioned Lasso. In fact, we complete our treewidth classification by
proving that for any treewidth-t graph, there exists a Gaussian Markov
Random Field on this graph such that the preconditioned Lasso, with any
choice of preconditioner, requires t^{Omega(1)} samples to recover
O(log n)-sparse signals when covariates are drawn from this model.
Provable benefits of score matching
Score matching is an alternative to maximum likelihood (ML) for estimating a
probability distribution parametrized up to a constant of proportionality. By
fitting the "score" of the distribution, it sidesteps the need to compute
this constant of proportionality (which is often intractable). While score
matching and variants thereof are popular in practice, the benefits and
tradeoffs relative to maximum likelihood -- both computational and
statistical -- are not precisely understood. In this work, we give
the first example of a natural exponential family of distributions such that
the score matching loss is computationally efficient to optimize, and has a
comparable statistical efficiency to ML, while the ML loss is intractable to
optimize using a gradient-based method. The family consists of exponentials of
polynomials of fixed degree, and our result can be viewed as a continuous
analogue of recent developments in the discrete setting. Precisely, we show:
(1) Designing a zeroth-order or first-order oracle for optimizing the maximum
likelihood loss is NP-hard. (2) Maximum likelihood has a statistical efficiency
polynomial in the ambient dimension and the radius of the parameters of the
family. (3) Minimizing the score matching loss is both computationally and
statistically efficient, with complexity polynomial in the ambient dimension.
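The computational half of (3) can be made concrete in one dimension. For a density proportional to exp(f_theta(x)) with f_theta a polynomial, the (Hyvarinen) score matching objective E[(1/2) f_theta'(x)^2 + f_theta''(x)] is quadratic in theta, so its empirical minimizer is a single linear solve and never touches the normalizing constant. The sketch below works under those assumptions (one-dimensional only, with our notation; the paper's results concern the higher-dimensional family).

```python
# Empirical score matching for p_theta(x) proportional to exp(sum_j theta_j x^j)
# on R. The loss (1/n) sum_i [ (1/2) f'(x_i)^2 + f''(x_i) ] is quadratic in
# theta, so the minimizer is found by one linear solve.
import numpy as np

def score_matching_fit(x, degree):
    """Minimize the empirical score matching loss over theta in R^degree."""
    j = np.arange(1, degree + 1)
    D1 = j * x[:, None] ** (j - 1)                         # rows give f'(x_i)
    D2 = j * (j - 1) * x[:, None] ** np.maximum(j - 2, 0)  # rows give f''(x_i)
    A = D1.T @ D1 / len(x)         # quadratic term: (1/2) theta^T A theta
    b = D2.mean(axis=0)            # linear term:    b . theta
    return np.linalg.solve(A, -b)  # stationary point of the quadratic loss

# Fit a standard Gaussian, i.e. exp(-x^2/2), within the degree-4 family.
rng = np.random.default_rng(0)
x = rng.standard_normal(5000)
print(score_matching_fit(x, degree=4))  # approx [0, -0.5, 0, 0]
```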
Learning in Observable POMDPs, without Computationally Intractable Oracles
Much of reinforcement learning theory is built on top of oracles that are
computationally hard to implement. Specifically for learning near-optimal
policies in Partially Observable Markov Decision Processes (POMDPs), existing
algorithms either need to make strong assumptions about the model dynamics
(e.g. deterministic transitions) or assume access to an oracle for solving a
hard optimistic planning or estimation problem as a subroutine. In this work we
develop the first oracle-free learning algorithm for POMDPs under reasonable
assumptions. Specifically, we give a quasipolynomial-time end-to-end algorithm
for learning in "observable" POMDPs, where observability is the assumption that
well-separated distributions over states induce well-separated distributions
over observations. Our techniques circumvent the more traditional approach of
using the principle of optimism under uncertainty to promote exploration, and
instead give a novel application of barycentric spanners to constructing policy
covers.
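For readers unfamiliar with the primitive: a C-approximate barycentric spanner of a finite set of vectors spanning R^d is a subset of d of them such that every vector in the set is a linear combination of the subset with coefficients in [-C, C]. The sketch below is the classical Awerbuch-Kleinberg swap procedure for computing one; it illustrates the generic primitive only, not this paper's construction of policy covers from it.

```python
# Awerbuch-Kleinberg-style computation of a C-approximate barycentric spanner:
# greedily build a full-rank basis from the set, then swap in any vector that
# grows |det| by more than a factor C; on termination the basis is a spanner.
import numpy as np

def barycentric_spanner(vectors, C=2.0):
    """Indices into `vectors` of a C-approximate barycentric spanner.

    Assumes C > 1 and that the set spans R^d.
    """
    V = np.asarray(vectors, dtype=float)
    d = V.shape[1]
    basis, idx = np.eye(d), [-1] * d

    def det_with(col, v):
        B = basis.copy()
        B[:, col] = v
        return abs(np.linalg.det(B))

    for i in range(d):                    # greedy initialization
        j = max(range(len(V)), key=lambda j: det_with(i, V[j]))
        basis[:, i], idx[i] = V[j], j

    improved = True                       # swap phase enforces C-approximation
    while improved:
        improved = False
        for i in range(d):
            for j, v in enumerate(V):
                if det_with(i, v) > C * abs(np.linalg.det(basis)):
                    basis[:, i], idx[i] = v, j
                    improved = True
    return idx

# Example: a spanner of 40 random vectors in R^3.
rng = np.random.default_rng(0)
print(barycentric_spanner(rng.standard_normal((40, 3))))
```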